Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

نویسنده

  • A. Povitsky
چکیده

A very efficient direct solver, known as the Thomas algorithm, is frequently used for the solution of band matrix systems that typically appear in models of mechanics. The processor idle time is a reason for the poor parallelization efficiency of the solvers based on the pipelined parallel Thomas algorithms. In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain. Key words, parallel computing, parallelization model, directionally split methods, pipelined Thomas Algorithm, banded matrices, ADI and FS mcthods Subject classification. Computer Science

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF (2m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...

متن کامل

Numerically Stable Variants of the Communication-hiding Pipelined Conjugate Gradients Algorithm for the Parallel Solution of Large Scale Symmetric Linear Systems

By reducing the number of global synchronization bottlenecks per iteration and hiding communication behind useful computational work, pipelined Krylov subspace methods achieve significantly improved parallel scalability on present-day HPC hardware. However, this typically comes at the cost of a reduced maximal attainable accuracy. This paper presents and compares several stabilized versions of ...

متن کامل

Optimal fast digital error correction method of pipelined analog to digital converter with DLMS algorithm

In this paper, convergence rate of digital error correction algorithm in correction of capacitor mismatch error and finite and nonlinear gain of Op-Amp has increased significantly by the use of DLMS, an evolutionary search algorithm. To this end, a 16-bit pipelined analog to digital converter was modeled. The obtained digital model is a FIR filter with 16 adjustable weights. To adjust weights o...

متن کامل

A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver

In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...

متن کامل

Fuzzy Programming for Parallel Machines Scheduling: Minimizing Weighted Tardiness/Earliness and Flow Time through Genetic Algorithm

Appropriate scheduling and sequencing of tasks on machines is one of the basic and significant problems that a shop or a factory manager encounters; this is why in recent decades extensive studies have been done on scheduling issues. One type of scheduling problems is just-in-time (JIT) scheduling and in this area, motivated by JIT manufacturing, this study investigates a mathematical model for...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998